 Myrtle Beach


Why and When LLM-Based Assistants Can Go Wrong: Investigating the Effectiveness of Prompt-Based Interactions for Software Help-Seeking

arXiv.org Artificial Intelligence

Large Language Model (LLM) assistants, such as ChatGPT, have emerged as potential alternatives to search methods for helping users navigate complex, feature-rich software. LLMs use vast training data from domain-specific texts, software manuals, and code repositories to mimic human-like interactions, offering tailored assistance, including step-by-step instructions. In this work, we investigated LLM-generated software guidance through a within-subject experiment with 16 participants and follow-up interviews. We compared a baseline LLM assistant with an LLM optimized for particular software contexts, SoftAIBot, which also offered guidelines for constructing appropriate prompts. We assessed task completion, perceived accuracy, relevance, and trust. Surprisingly, although SoftAIBot outperformed the baseline LLM, our results revealed no significant difference in LLM usage and user perceptions with or without prompt guidelines and the integration of domain context. Most users struggled to understand how the prompt's text related to the LLM's responses and often followed the LLM's suggestions verbatim, even if they were incorrect. This resulted in difficulties when using the LLM's advice for software tasks, leading to low task completion rates. Our detailed analysis also revealed that users remained unaware of inaccuracies in the LLM's responses, indicating a gap between their lack of software expertise and their ability to evaluate the LLM's assistance. With the growing push for designing domain-specific LLM assistants, we emphasize the importance of incorporating explainable, context-aware cues into LLMs to help users understand prompt-based interactions, identify biases, and maximize the utility of LLM assistants.


Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions

arXiv.org Artificial Intelligence

Chain-of-thought (CoT) prompting has been shown to empirically improve the accuracy of large language models (LLMs) on various question answering tasks. While understanding why CoT prompting is effective is crucial to ensuring that this phenomenon is a consequence of desired model behavior, little work has addressed this; nonetheless, such an understanding is a critical prerequisite for responsible model deployment. We address this question by leveraging gradient-based feature attribution methods which produce saliency scores that capture the influence of input tokens on model output. Specifically, we probe several open-source LLMs to investigate whether CoT prompting affects the relative importances they assign to particular input tokens. Our results indicate that while CoT prompting does not increase the magnitude of saliency scores attributed to semantically relevant tokens in the prompt compared to standard few-shot prompting, it increases the robustness of saliency scores to question perturbations and variations in model output.
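
As a rough illustration of what a gradient-based saliency score is, the sketch below computes gradient-times-input attributions for a toy logistic scorer over token embeddings. This is not the paper's setup: the model, the two-dimensional embeddings, the weights, and the function names (`model_output`, `saliency_scores`) are all hypothetical and chosen only so the gradient can be written by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model_output(embeddings, weights):
    """Toy stand-in for a model: logistic score over the summed token embeddings."""
    z = sum(w * x for emb in embeddings for w, x in zip(weights, emb))
    return sigmoid(z)


def saliency_scores(embeddings, weights):
    """Gradient-times-input saliency per token, using the analytic gradient.

    For this toy model, d(output)/d(embedding) = s * (1 - s) * weights,
    and it is the same vector for every token position.
    """
    s = model_output(embeddings, weights)
    grad = [s * (1.0 - s) * w for w in weights]
    return [abs(sum(g * x for g, x in zip(grad, emb))) for emb in embeddings]


# Hypothetical tokens and embeddings, for illustration only.
tokens = ["what", "is", "two", "plus", "three"]
embeddings = [[0.1, 0.0], [0.0, 0.1], [0.9, 0.4], [0.2, 0.3], [0.8, 0.5]]
weights = [1.0, -0.5]

scores = saliency_scores(embeddings, weights)
ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
for token, score in ranked:
    print(f"{token:>6}  {score:.4f}")
```

With a real LLM the gradient would come from automatic differentiation rather than a hand-derived formula, and the comparison of interest would be how these per-token scores change between standard few-shot and CoT prompts.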


Large Language Models Based Automatic Synthesis of Software Specifications

arXiv.org Artificial Intelligence

Software configurations play a crucial role in determining the behavior of software systems. To ensure safe and error-free operation, it is necessary to identify the correct configurations, along with their valid bounds and rules, which are commonly referred to as software specifications. As software systems grow in complexity and scale, the number of configurations and associated specifications required to ensure correct operation can become large and prohibitively difficult to manipulate manually. Due to the fast pace of software development, correct software specifications are often not thoroughly checked or validated within the software itself. Rather, they are frequently discussed and documented in a variety of external sources, including software manuals, code comments, and online discussion forums. It is therefore hard for the system administrator to know the correct specifications of configurations, given the lack of clarity, organization, and a centralized, unified source to consult. To address this challenge, we propose SpecSyn, a framework that leverages a state-of-the-art large language model to automatically synthesize software specifications from natural language sources. Our approach formulates software specification synthesis as a sequence-to-sequence learning problem and investigates the extraction of specifications from large contextual texts. This is the first work that uses a large language model for end-to-end specification synthesis from natural language texts. Empirical results demonstrate that our system outperforms the prior state-of-the-art specification synthesis tool by 21% in terms of F1 score and can find specifications from single as well as multiple sentences.
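
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The definition below is the standard one; reading TP, FP, and FN as correctly extracted, spuriously extracted, and missed specifications is an assumption, since the abstract does not say how matches are counted.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}
```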


Can AI answer your money questions? We put chatbots to the test

#artificialintelligence

NEW YORK, April 13 (Reuters) - Face it, we could all use a little help with our money. So who better to ask for personal finance advice than a couple of the most powerful chatbots on the planet? Both OpenAI's ChatGPT and Google's Bard are dominating headlines recently, for their generative capabilities and vast storehouses of information. Each has far more processing power than, say, any individual personal finance writer (ahem). What is one great business idea?


How AI Could Change the Highly-Skilled Job Market

#artificialintelligence

When most people think of the connection between technology and jobs, they think of robots and automation taking over relatively unskilled jobs like factory work. And thus, the biggest toll from these technological advances would be on already hard-hit manufacturing regions of the Rust Belt. But a new wave of developments in artificial intelligence may have a greater effect on high-skilled jobs and high-tech knowledge regions. The study by Mark Muro, Jacob Whiton, and Robert Maxim takes a close look at the potential of artificial intelligence--or AI--to automate tasks that until now have required human intelligence and decision-making. As they put it: "Unlike robotics (associated with the factory floor) and computers (associated with routine office activities), AI has a distinctly white-collar bent."


Automation and AI sound similar, but may have vastly different impacts on the future of work

#artificialintelligence

Last November, Brookings published a report on artificial intelligence's impact on the workplace that immediately raised eyebrows. Many readers, journalists, and even experts were perplexed by the report's primary finding: that, for the most part, it is better-paid, better-educated white-collar workers who are most exposed to AI's potential economic disruption. This conclusion--by authors Mark Muro, Robert Maxim, and Jacob Whiton--seemed to fly in the face of the popular understanding of technology's future effects on workers. For years, we've been hearing about how these advancements will force mainly blue-collar, lower-income workers out of jobs, as robotics and technology slowly consume those industries. In an article about the November report, The Mercury News outlined this discrepancy: "The study released Wednesday by the Brookings Institution seems to contradict findings from previous studies--including Brookings' own--that showed lower-skilled workers will be most affected by robots and automation, which can involve AI."


Feature and TV films

Los Angeles Times

Mr. Smith Goes to Washington 1939 TCM Tue. 7 p.m. Mean Streets 1973 Cinemax Sun. 6 a.m. Batman Begins 2005 AMC Sun. Throw Momma From the Train 1987 EPIX Sun. Die Hard 1988 IFC Sun. I Know What You Did Last Summer 1997 Starz Tue. Gone in 60 Seconds 2000 CMT Wed. 8 p.m., Thur. Total Recall 1990 Encore Thur. 2 a.m. A Fish Called Wanda 1988 Encore Thur. 2 p.m., 9 p.m. The World Is Not Enough 1999 EPIX Sat. 4 p.m. Look Who's Talking 1989 OVA Sun. Die Hard With a Vengeance 1995 IFC Thur. Oil-platform workers, including an estranged couple, and a Navy SEAL make a startling deep-sea discovery. A clueless politician falls in love with a waitress whose erratic behavior is caused by a nail stuck in her head. After glimpsing his future, an ambitious politician battles the agents of Fate itself to be with the woman he loves. To help a friend, a suburban baby sitter drives into downtown Chicago with her two charges and a neighbor. Two teenage baby sitters and a group of children spend a wild night ...


In These Small Cities, AI Advances Could Be Costly

MIT Technology Review

It's long been clear that urbanization and automated technologies are shaping society, but it hasn't been obvious how the two forces affect each other. A new study from MIT's Media Lab posits that the smaller the city, the greater the impact it faces from automation. The finding, they say, could encourage legislators to pay special attention to workers in smaller cities and offer them support services. Other researchers have attempted to measure the effect of technology on employment in cities, but the Media Lab authors, who have identified which jobs and skills tend to be more prevalent in smaller cities and larger ones, claim to be the first to explain why different U.S. cities are more susceptible (or resilient) to technological unemployment. They say that bigger cities have a disproportionately large number of jobs for people who do cognitive and analytical tasks, such as software developers and financial analysts--occupations that are less likely to be disrupted by automation.


What Is CamperForce? Amazon's Nomadic Retiree Army

WIRED

In the spring of 1960, just after he turned 16, Chuck Stout went to work as a "garbage boy" at a McDonald's in Toledo, Ohio. For 85 cents an hour, he swept and mopped the floors, kept the drive-in lot tidy, filled the shake machine, and washed dishes. It was an escape--somewhere to go that wasn't the Weiler Homes public housing complex, where he lived with his mother and sister. They were barely scraping by. "My mom drank so much," he says, "she didn't know what I was doing." Not only did Chuck love his job, the job loved him. He went from garbage boy to french fry maker to burger cook to cashier. He became a manager, then a supervisor, then a field consultant, then a professor at Hamburger University, where McDonald's trains new franchise owners and managers. By 1976, Chuck was serving as a director of product development for the entire corporation. The next year, he was on the team that brought ice cream sundaes to the chain's menu. For the effort, Chuck was rewarded with a handsome bonus and a personal letter from founder Ray Kroc, whose wisdom Chuck was fond of quoting from memory. Chuck eventually got fed up with corporate culture and told his superiors he wanted to go back out "in the field." When two planes hit the World Trade Center in 2001, he was 57 and running his own McDonald's franchise in Columbia, Pennsylvania. He rushed to Manhattan, where for three days he loaded up Egg McMuffins, hash browns, and coffee, first onto a luggage trolley, then a golf cart, and hauled them down to the debris pit to feed rescuers.


Feature and TV films

Los Angeles Times

The Lost World: Jurassic Park 1997 AMC Sun. Tomorrow Never Dies 1997 EPIX Wed. 10 p.m., Thur. The X-Files: Fight the Future 1998 IFC Thur. Hard to Kill 1990 Sundance Mon. 8 p.m., Tue. A scientist gives his bodyguard superhuman powers in order to fight racists. A lawyer unwittingly becomes friends with an unstable woman who has a criminal history. A successful businesswoman puts her family, career and life on the line to satisfy her addiction to sex. With his father trapped in the wreckage of their spacecraft, a youth treks across Earth's now-hostile terrain to recover their rescue beacon and signal for help. In the future a cutting-edge android in the form of a boy embarks on a journey to discover his true nature. An 11-year-old boy experiences the worst day of his young life but soon learns that he's not alone when other members of his family encounter their own calamities. A struggling writer falls in love with a stenographer while trying to finish his new novel in 30 days.